Evaluation and Analysis of Auditory Model Front Ends for Robust Speech Recognition Program Summary
Abstract
This project was motivated by the need for improved speech recognition in noise, and by the expectation that auditory model front ends could make recognition more robust to noise, microphone variation, and speaking style. The project began in FY91 with the initial goal of implementing, evaluating, and comparing four promising auditory front ends: (1) the mean-rate and synchrony outputs of S. Seneff's auditory model; (2) the ensemble interval histogram (EIH) model developed by O. Ghitza; (3) the IMELDA model due to M. Hunt; and (4) an auditory model developed by J. Cohen. The initial effort has included implementation and integration of the first three models with a Lincoln HMM recognizer, and testing with speech in white noise and in a background of speech babble. Compared with the mel-cepstrum technique, the auditory models have similar performance at high SNR and slightly better performance at low SNR. To select the best features from each front end, dimensionality reduction techniques, including Linear Discriminant Analysis (LDA) and Principal Components Analysis (PCA), have been implemented. LDA has been applied to obtain a transformation matrix for IMELDA, developed using 100,000 labeled speech frames (1,000 sentences) from the TIMIT corpus. Testing of IMELDA is about to begin on the TI-105 corpus.
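The abstract does not say how the noisy test material was prepared. As a minimal sketch of what testing "at high SNR" and "at low SNR" involves, the following scales white Gaussian noise so that the speech-to-noise power ratio matches a target value in dB. The function names, the synthetic tone standing in for speech, and the 10 dB target are all illustrative assumptions, not details from the project.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def add_white_noise(speech, target_snr_db, seed=0):
    """Return (noisy, noise): speech plus white noise scaled to target_snr_db.

    Assumption: SNR is defined over the whole utterance as
    10 * log10(P_speech / P_noise).
    """
    rng = random.Random(seed)
    raw = [rng.gauss(0.0, 1.0) for _ in speech]
    # Pick gain g so that P_speech / (g^2 * P_raw) == 10^(target_snr_db / 10).
    gain = math.sqrt(power(speech) / (power(raw) * 10 ** (target_snr_db / 10)))
    noise = [gain * n for n in raw]
    noisy = [s + n for s, n in zip(speech, noise)]
    return noisy, noise


def measure_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise added to it."""
    return 10 * math.log10(power(speech) / power(noise))


# Hypothetical stand-in for a speech waveform: a 200 Hz tone sampled at 8 kHz.
speech = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(8000)]
noisy, noise = add_white_noise(speech, target_snr_db=10.0)
measured = measure_snr_db(speech, noise)
```

Babble noise could be handled in the same way by replacing the Gaussian samples with a recorded babble segment of equal length; only the gain computation matters for setting the SNR.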
Similar resources
Robust parameters for speech recognition based on subband spectral centroid histograms
In this paper we propose a new speech parameterization framework that efficiently combines frequency and magnitude information from the short-term power spectrum of speech. This is achieved through computation of subband spectral centroid histograms (SSCH). Relationship between the proposed method and auditory based speech parameterization methods is discussed. An experimental study on an autom...
Properties of Auditory Model Representations
We address the problem of robustness of auditory models as front ends for speech recognition. Auditory models have been referred to as superior front ends when speech is corrupted by noise or linear filtering, but there is not yet a deep understanding of their functioning. We analyze some commonly used auditory models and show that they present some interesting properties which are useful for robust...
Combined speech enhancement and auditory modelling for robust distributed speech recognition
The performance of Automatic Speech Recognition (ASR) systems in the presence of noise is an area that has attracted a lot of research interest. Additive noise from interfering noise sources, and convolutional noise arising from transmission channel characteristics both contribute to a degradation of performance in ASR systems. This paper addresses the problem of robustness of speech recognitio...
Comparative evaluations of several front-ends for robust speech recognition
Doh-Suk Kim†, Jae-Hoon Jeong†, Soo-Young Lee†, Rhee M. Kil‡. †Department of Electrical Engineering / ‡Division of Basic Science, Korea Advanced Institute of Science and Technology, 373-1 Kusong-dong, Yusong-gu, Taejon 305-701, Korea. E-mail: [email protected]. Abstract: Zero-crossings with peak amplitudes (ZCPA) model motivated by human auditory periphery is simple compared wi...
Comparison of Auditory Models for Robust Speech Recognition
Two auditory front ends which emulate some aspects of the human auditory system were compared using a high performance isolated word Hidden Markov Model (HMM) speech recognizer. In these initial studies, auditory models from Seneff [2] and Ghitza [4] were compared using both clean speech and speech corrupted by speech-like "babble" noise. Preliminary results indicate that the auditory models re...
Publication date: 1992